It’s a lovely morning, and you are jogging along the lakeshore, along with many others (all in the same direction). Albert is the median runner (that is, he runs at the median speed). Betty is the average runner (she runs at the average — i.e. the mean — speed.)
You notice that the number of runners you pass is exactly equal to the number of runners who pass you.
Can you determine whether you’re running faster or slower than Albert? What about Betty?
Edited to add: When I said you were running “along the lakeshore”, I was envisioning the shore of Lake Michigan; i.e. I meant to say that you’re running, effectively, in a straight line, not a circle. Obviously I should have made this clearer. But it’s a good puzzle either way!
You’re running the same speed as Albert. Since Albert is the median, he should be in the middle and if there are the same number of runners you pass as there are ahead of you, you should be with Albert.
You can’t for Betty. If the runners ahead of you are going really fast her average speed will be skewed.
You can’t. Just because you have been passing same number of people as have been passing you until now doesn’t mean the situation will not change. There could for example be a lot of fast runners up front that you’ll never catch skewing the median
You are running the same speed as Betty, the average speed.
[SPOILER]
Let f(x) denote the probability distribution of runners overall. Let s denote your speed (say in meters per minute). Let K denote the average number of total runners per meter.
Then the expected number you pass in the next minute is:
K*integral(0,s){(s-x)f(x)dx}
The expected number who pass you in the next minute is:
K*integral(s,infinity){(x-s)f(x)dx}
Setting these equal and doing some algebra:
s*integral(0,infinity){f(x)dx} = integral(0,infinity){x f(x) dx}
So s=E(x) (Because f(x) is a density and integrates to 1)
Assuming there are no “tricks” — that everyone runs at constant speed, that you’re jogging in a circular loop and everyone stays running long enough that a faster runner will always eventually pass a slower one, that you don’t count the same person twice if you pass them twice, etc. —
If the number of runners faster than you is the same as the number slower than, you, then you must be *one of* the clump of runners running at the median speed. Not necessarily the only one, in fact, Albert must be one of the others, so you’re running at the same speed as Albert.
You can’t determine it for Betty. If the speeds are 1, 1, 2, 2, 3, 9 with you and Albert being the 2’s, then Betty is the 3 (the mean) and she’s running faster than you. On the other hand if the speeds are 1, 7, 8, 8, 9, 9, with you and Albert being the 8’s, then Betty is the 7 (the mean) and she’s slower than you.
@ed I don’t think probabilities have anything to do with it because we’re talking about the number of joggers who pass you, *ever* (so the faster jogger is always guaranteed to pass the slower jogger). The problem doesn’t say we’re only looking at passes in the next minute.
Ed, unless “mean speed” has some meaning I dont understand, that can’t be right. Zak is running at 100mph, Yollande at 27.5, me at 4, Will at 3 and Vera at 3. I pass two, and am passed by two, but Yollande is the one running at,mean speed.
Zak is running at 100 mph, Yolande at 27.5, Will at 5, me at 3, Vera at 2. Zak, Yolande and Vera started before me and Will after me.
Will will pass me and I will pass Vera. I am slower than both mean and median. It’s possible to create a mirror example where I am faster than both.
Therefore nothing interesting can be said about my speed in relation to mean and median.
I think it is key whether you are on a circular track. I had assumed a linear race when I first read it – i.e. a very big lake. Just to check, do the faster runners catch up again with the slower runs as they go round?
I think you can’t determine this, because it’s not known where in the pack I started. If I’m first off the mark and I pass nobody and nobody passed me, I win. If I’m at the very back of the pack, and I pass nobody and nobody passes me, I become last.
Please note that in both cases the numbers don’t matter: if I’m first over the starting line, and 5 people pass me and I pass 5 people, I’m still first. But that implies that speeds will vary over the course of the race.
Well, if we are on a circular track and the property holds for infinite amount of time, then median=mean=me.
@Harold
It’s really about the math. Ignore the word problem description of jogging on the lakeshore, i.e. by “pass”, it’s safe to assume it simply means “faster”. If you try to consider the “real world” scenario, it’s going to be impossible to determine, e.g. you could be faster than someone but not pass them because you started ahead of them on a linear path.
I agree with Bearce’s first reply, which means it probably is wrong :-)
You may feel like you are at the median (Albert’s) speed.
But clearly that is wrong, because someone’s chance of passing you or being passed by you is weighted by the difference between your speeds.
Once weighting is involved, intuitively the answer is likely to be the mean (Betty’s) speed. I’ll leave it to Ed to do the maths.
Intuition again, but I suspect a circular track messes up that simple solution again, by giving super weighting to outliers. Perhaps on a circular track you can infer you are at Charlie’s (mean-squares) speed?
I am Albert
There are 15 runners, I am passed by 7 and I pass 7 so I finish in the middle.
You’re the median runner, so will be running at the same speed as Albert.
I agree with @Pavel that in the limit (given a circular track and an infinite amount of time) the speeds of the runners must be symmetrical about the median, otherwise, given enough time, the ‘fast’ runners would pass you more frequently than you pass the ‘slow’ runners breaking the equal passing rule.
Therefore if the speeds are symmetrical about the median the mean must equal the median also.
So I’m running the same speed as both Albert and Betty.
Suppose A is running at 99 laps/hour, B is running at 1 lap/hour and I, C, am running at 50 laps/hour – both mean and median.
A is gaining on me at 49 laps per hour, so overtakes me 49 times per hour. Similarly, I overtake B 49 times per hour.
Now I increase my speed to 98 laps/hour. Now A only overtakes me once an hour, but I overtake B 97 times per hour. But I am still running at the median speed.
So, in this case, the number of passers and passees being equal corresponds to the case where I run at the mean, but not the case where I run at the median-which-is-not-also-the-mean.
This is quite different from the case of s straight track where I will always overtake anyone ahead of me and slower than me exactly once and be overtaken by anyone behind me and faster than me exactly once.
I don’t think you can say for sure without making some assumptions. If that’s the case then I’m going to assume that I know Albert and Betty and can see where they are which makes it trivial to determine whether I’m running faster or slower than them.
Those who say you’re the median runner: this would be true if the number of named, individual, runners passing you was the same as the number being passed. But I don’t think that’s what the question says. For example if there is one exceptional athlete who passes you 10 times while all those who you’ve passed have only been overtaken once then you’re definitely not the median runner.
From playing with some example distributions in a spreadsheet, I think that the answer must be that you’re the mean runner, though I’m struggling to prove it.
So for I think you are betty (or going at the same speed as her.) Reasoning goes like this. Imagine everyone gets slower by the mean speed. Now some people are going backwards and some forwards. By definition the average speed is now 0. Some guys are running backwards (negetive speed). The total speed however is 0. So the number of laps covered total is 0. Hence total forward laps=backward laps (relative to some landmark), and betty is stationary. Ergo as many people pass her as she passes.
Assume that runners are spread evenly along an infinite track, each running at a steady speed. Anyone running slower than me I will eventually overtake. Anyone running faster than me will eventually overtake me. Since the numbers are the same my initial conclusion was that I am the median runner, the same number faster than slower.
BUT…if all the faster runners were travelling lots faster than me, but all the slower runners were travelling only a tiny bit slower, then I would overtake very few slower ones, but lots of faster ones would overtake me, so my initial conclusion seems to be wrong. In order to overtake the same number of slower runners, I think I would have to be travelling at the mean speed – the number of slower runners would need to be much larger than the number of faster runners. Albert would be in the slower runner group.
However, if the faster runners were only slightly faster than me, but the slower runners were positively snail-like, then Albert would be in the faster runners group.
I conclude that you cannot say anything about Albert from the information given.
Is the distribution constant the whole time? It seems that if you are at the starting line, then if everyone maintains the same speed throughout then you are Albert (the number who pass you at the start being equal to the number you pass at the start implies half of the distribution is above you and half below). The position of Betty just depends on the skewness of the distribution.
I don’t think this is the point. Suppose the distribution is set 1 minute into the race. At this point, you then start to look at who you pass and who passes you. With constant speed for everyone, no one passes you and you pass no one. But if the runners within the race start to change speed in such a way that still ensures the mean and median stay the same, then you can observe # passing = # passed by and still have no idea where you are. If the median is 5 and you run at 2.5, having the same # pass you and you pass the same # still does not ensure you are at the median. Same would go for you at 7.5.
So I guess it depends on when you start counting the number you pass and those that pass you by. I think this post is related to income distribution and not necessarily some big mathematical trick. I’ve been wrong before on this blog though.
No runners overtake any of us. It so happens that we start abreast. Some move ahead some I leave behind. But we all maintain a constant speed, so there is no change in the order.
Assumption A: we’re on a “small enough” circular track (or for a “long enough” amount of time).
Assumption B-1: We only count each other runner once, no matter how many times he passes us. Then I am running the median speed.
Assumption B-2: We count each runner as many times as we pass him. Then I am not necessarily running the median or the mean. For example: if Mo Farah is running 100 times as fast as me, and 99 other runners are running half my speed, then Mo will pass me 99 times in a lap, and I’ll pass all 99 other runners once a lap. But then the mean speed is 1.495 times greater than my speed, and the median speed is half my speed.
Maybe I’m running some sort of logarithmic or geometric mean?
(Note: Assumptions B-1 and B-2 are mutually exclusive, in case that isn’t obvious.)
Harold, you would overtake the slower runners eventually. If there were equal numbers of faster and slower runners than you, given enough time, the number of runners you pass would equal the number of runners who pass you.
Good set of assumptions in your first paragraph BTW.
We can only solve this puzzle subject to the number of simplifying assumptions we make.
If we are running on a very small loop such that we encounter every other runner at least once, then we only encounter runners who are NOT running the same speed we are.
Alternatively, there is a position problem: We may be running much faster than someone very far ahead of us, who we never have time to pass due to distance and/or time constraints.
Alternatively, we may be the median runner in our local cluster, but our local cluster may be an outlier with respect to the total population running at the lake.
Alternatively, perhaps every runner at the lake that day is running exactly the same speed, except one very slow person (who you pass) and one very fast person (who passes you).
There are too many unknowns. We cannot solve the problem.
@ Salim
For assumption B-2, I get myself running at the average speed (you undercounted the number runners in the “everyone else” group):
Taking your example
My speed = 2, Runner A’s speed = 200, Everyone else’s speed = 1.
Assume a lap length is 100 and all runners start at the same point.
At t = 100, A completes his 200th lap at the same time I complete my 2nd lap. He is passing me for the 198th time.
The other runners have just completed their first lap. I am passeing each of them for the first time.
Under assumption B-2, there must by 198 others, so the average speed is 2. I guess that makes me Betty.
“the number of runners you pass is exactly equal to the number of runners who pass you.”
This is underspecified. Pass when? In a short time period? Much then depends on the initial starting positions. If am the median of a staggered group far ahead of the rest I can get just this passing pattern and be faster or slower than the rest of the runners in hte main group.
I think people are having a lot of trouble translating the word problem into math.
When it says “you notice that the number of runners you pass is exactly equal to the number of runners who pass you,” the reasonable interpretation is that “the average number passing you equals the number you pass per period of time.” We are not making an observation about the whole population of runners, but simply about the number of passes that you “notice.” (And from there we are asked to derive a conclusion about the whole population.)
When it says there are “many” jogging around a “lakeshore,” you should take that to mean that the number of runners is large enough that the law of large numbers kicks in, and that the running track is large enough that we don’t have to worry about the same runner passing us multiple times during the time we have been counting passes.
From there, use the math in my third comment to prove the result.
I think this is clearly the interpretation Landsberg intended, although I agree it could be more clearly worded.
Ed: “When it says there are “many” jogging around a “lakeshore,” you should take that to mean that the number of runners is large enough that the law of large numbers kicks in, and that the running track is large enough that we don’t have to worry about the same runner passing us multiple times during the time we have been counting passes.”
I think this was my assumption. I assume that there are people continually joining and leaving the lakeshore, but the number of runners and their speed distribution stays about the same. Infinite time is not possible, but the duration is long enough to work out a rate of passing.
My qualitative assumptions led me to conclude that you are travelling at mean speed (Betty). Is that what your comment 3 says (s=E(x))?
I don’t know why you don’t just remember whether or not you passed Betty and Albert that’s only 2 bits of information which is much smaller than remembering the difference between the number of people you pass and the number who pass you
The small problem, it seems to me, with thinking of a circular track is that it suggests you’re noticing all runners (those you’re passing and those passing you). But here there are lots of runners, and you’re guessing based just those you meet in passing encounters. And [cue mental calculation….] the answer is: you’re moving at the avg. pace, and slower than Albert.
I am the slowest runner, and everyone started ahead of me. I pass noone, and noone passes me.
I will not attempt to answer until I know whether the track is circular or linear and whether runners enter the track at the same time or otherwise.
There seems to be an asymmetry in this problem.
Suppose the trail is not circular and has a length equal to L, and suppose you run with velocity V. If you start running at time 0, you’ll finish at time L/V.
If someone runs at velocity V+dV, then they have to start somewhere between t=0 and t=(L/V)-(L/(V+dV)) if they are going to pass you.
If someone runs at velocity V-dV, then they have to start somewhere between t=(L/V)-(L/(V-dV)) and t=0 if you are going to pass them.
If people start running “uniformly random” in time with “random” velocities, then it appears that you are more likely to pass someone with velocity V-dV than you are to be passed by someone with velocity V+dV because there is a bigger window of starting times for people you will pass. So if the number of runners you pass is equal to the number of runners who pass you, this may mean that you are running faster than Albert, but I haven’t proved this. (Though I think it is true for a distribution of velocities that is symmetric about the mean.)
Or, I could be confused . . .
This puzzle satisfies the Good Puzzle Theorem not by being interesting mathematically, but by being deliberately confusing semantically in an interesting way.
If he said “the number of runners who pass you (counting each runner once, no matter how many times they pass you) is…” the problem would be trivial, and the answer would be A.
If he said “the number of runners who pass you (counting each runner every time they pass you) is…” the problem would be moderately easy, and the answer would be B.
But he said “the number of runners who pass you (and I won’t say exactly what that means) is…”, so the semantic ambiguity leads to interesting discussions, irrefutable arguments that the answer must be A, others that it must be B, still more that it must be both even though these can be different, and even discussions on the Good Puzzle Theorem itself and how this puzzle satisfies it, and the puzzle’s similarity to other (simpler) puzzles.
In this sense (the sense that it relies on and is strengthened by the deliberate semantic ambiguity), it’s a similar puzzle to “I have 14 lumps of sugar, and 3 cups of tea. How can I put the 14 lumps into the 3 cups so that there is an odd number of lumps in each cup?”
@ThomasBayes – the asymmetry disappears (or almost), I suspect, if you consider log(V) as the randomly chosen variable, rather than V. In some sense, log(V) is a more natural choice for the variable of interest. It’s less scale dependent, for a start.
I am a rock.
I am an island.
I agree that we need more initial condition info, if not more constraints on people joining, leaving, changing speed, etc.
Suppose I gradually pass one person and one person gradually passes me. Also, a million people are a million miles ahead of us. They are running at slightly different speeds, either far faster or far slower than the three of us. Albert and Betty will always be among them.
I like Jonathan Kariv’s idea of subtracting average speed from everybody. He did it for a track, so I would like to think about it on the linear trail, although I think it is the same. This assumes a “steady state” of runners along the shore. Subtract the average speed from everyone, so the average speed is now zero. So the rate of people passing a stationary spot is the same forward and backwards. If the average faster runner is travelling faster than the average slower runner, there must be fewer faster runners, so the rate of passing remains the same. Since you observe the same number passing backwards as forwards, you must also be stationary, which means you are travelling at the mean speed. Since the same mean can be produced from many different distributions leading to many different medians, you can know nothing about the median runner.
@ Harold
For the linear track, I think this brings up another ambiguity in the problem. Do we average over all runners, or over runners currently on the track?
If at any given time, the average speed of people on the track is X, and you pass the same number of people as pass you, then I agree that your speed is X.
But faster runners spent less time on the track, so for the average speed of people currently on the track to be X, the average over all people must be greater than X.
Okay, it is a linear track. But the answer must depend on the initial conditions. As Ian West points out, if we all enter the track at the same time and place, we just steadily increase distance from each other and no one can learn anything about the central tendency of the distribution from the fact that he passes no one and no one passes him. I think we must assume that runners enter the track at random times and places uncorrelated to their running speeds, in which case I am the median runner (I think).
@nobody really: Hmmmm. I’ve heard the words before.
Now that the post has been edited to clarify that runners are running in a straight line and not a loop — it would seem that the question is still unanswerable because it doesn’t specify whether runners all start at the same time, where the runners start relative to each other, the minimum distance that anybody runs for, etc.
Because otherwise you could just enter the track, run for one meter, and quit, without having passed anyone or being passed by anyone, and you could accurately say that “the number of runners you passed is the same as the number who passed you” (zero), but obviously you can’t say anything about the speeds of the other runners.
More edits please :)
Ed is right. You can follow the math that he writes. To get an intuitive feel for it, read rhamdu’s comments about weighting due to differences in speed.
I think most people are getting mixed up by imagining a small number, like just four people or 15. Try imagining an infinitely long track with infinitely many runners. Imagine people running at three different speeds, the median, the mean, and the top speed. For example, imagine 60% of the people are running at 5mph (the median), 20% at 6.25mph, and 20% running at 10mph (.6*5+.2*6.25+.2*10=6.25). Right away you can see that no one travelling at the median speed is passing anyone, but other people are passing them. This immediately means that the median CANNOT be the answer.
To visualize what’s happening, imagine the people running at the three different speeds are kindly enough to run equidistant from each other and in rows (slow on the left, the mean in the middle, and the fast on the right). Let the slow people be spaced at one every meter, the mean at one every three meters, and the fast at one every three meters. This way the average percentage for every six meters of track is exactly 60% at 5mph, 20% at 6.25mph, and 20% at 10mph, no matter how long you let the experiment run and no matter where you look on the track. Now run the experiment. Consider a specific person in the middle row and consider whatever length of time you want to make the observation convenient and see how many people on the left are passed by this person in the middle and see how many people on the right pass that person in the middle.
You can actually better visualize this for any number of speeds that people might be running (including an infinite number of different speeds), rather than just three.
@Ken, I tried imagining an infinite track with an infinite number of runners, but then I couldn’t work out the sum to find the mean, nor the middle value. I then tried to use a probability density function, but I found that the track I had imagined didn’t support riemann or lebesgue integration. Then I chopped my imaginary track up into pieces and reassembled them, and ended up with two identical tracks. Help please!
Mike H,
“I chopped my imaginary track up into pieces and reassembled them, and ended up with two identical tracks.”
Ha!
You are the median runner. Suppose there are N>=3 runners who had a strictly staggered start. If,eventually and finally, you overtook K runners and K runners overtook you, where K \l.e.q. Floor[N/2], then your start position could not have been in the first or last K. So your start position was somewhere in the middle (K,N-K) positions. Now if you started at some position N_1 \in (K,N-K), after you have overtaken K people who were ahead of you and after you’ve been passed by K people, your position in the race remains N_1. So basically, your speed rank winds up being the same as your initial start position. Since your initial position was equally like to be in (K,N-K), you Expected[speed rank]=N/2 making you the median runner.
@Ken, Thanks!
I know I’m late, but this has a rather simple solution. There is only one interpretation of the puzzle that makes much sense, which is that it is talking about the overtaking rate in the limit of an infinite duration run around a loop. So I’m assuming that.
If two runners have speeds u,v then they overtake at a rate (u-v)/C where C is the length of the loop. Positive implies the first overtakes the second, negative is vice-versa. For the runner with speed u to have the same rate of overtakers as overtakees, the sum over all runners of the expression (u-v_i)/C must be zero. It is immediate that u = sum(v_i)/n